Consider the problem of estimating the $\gamma$-level set $G^*_\gamma = \{x : f(x) \ge \gamma\}$
of an unknown $d$-dimensional density function $f$ based on $n$ independent
observations $X_1, \dots, X_n$ from the density. This problem has
been addressed under global error criteria related to the symmetric 
set difference. However, in certain applications a spatially uniform 
mode of convergence is desirable to ensure that the estimated set is 
close to the target set everywhere. The Hausdorff error criterion pro- 
vides this degree of uniformity and, hence, is more appropriate in such 
situations. It is known that the minimax optimal rate of error convergence
for the Hausdorff metric is $(n/\log n)^{-1/(d+2\alpha)}$ for level sets
with boundaries that have a Lipschitz functional form, where the parameter
$\alpha$ characterizes the regularity of the density around the level
of interest. However, the estimators proposed in previous work are
nonadaptive to the density regularity and require knowledge of the
parameter $\alpha$. Furthermore, previously developed estimators achieve
the minimax optimal rate for rather restricted classes of sets (e.g., 
the boundary fragment and star-shaped sets) that effectively reduce 
the set estimation problem to a function estimation problem. This 
characterization precludes level sets with multiple connected com- 
ponents, which are fundamental to many applications. This paper 
presents a fully data-driven procedure that is adaptive to unknown 
regularity conditions and achieves near minimax optimal Hausdorff 
error control for a class of density level sets with very general shapes 
and multiple connected components. 

1. Introduction. Level sets provide useful summaries of a function for 
many applications including clustering [6, 8, 21], anomaly detection [16, 20, 
24], functional neuroimaging [12, 25], bioinformatics [27], digital elevation 
mapping [19, 26] and environmental monitoring [22]. In practice, however,
the function itself is unknown a priori, and only a finite number of
observations related to $f$ are available. In this paper, we focus on the density level
set problem; extensions to general regression level set estimation should be
possible using a similar approach, but they are beyond the scope of this
paper. Let $X_1, \dots, X_n$ be independent, identically distributed observations
drawn from an unknown probability measure $P$, having density $f$ with
respect to the Lebesgue measure and defined on the domain $\mathcal{X} \subseteq \mathbb{R}^d$. Given a
desired density level $\gamma$, consider the $\gamma$-level set of the density $f$:

$$G^*_\gamma := \{x \in \mathcal{X} : f(x) \ge \gamma\}.$$

The goal of the density level set estimation problem is to generate an
estimate $\widehat{G}$ of the level set based on the $n$ observations $\{X_i\}_{i=1}^n$, such that the
error between the estimator $\widehat{G}$ and the target set $G^*_\gamma$, as assessed by some
performance measure which gauges the closeness of the two sets, is small.

Most literature available on level set estimation methods [9, 13, 14, 15,
16, 20, 23, 26] considers error measures related to the symmetric set
difference, $G_1 \Delta G_2 = (G_1 \setminus G_2) \cup (G_2 \setminus G_1)$. However, level set methods based
on a measure of the symmetric difference error may produce estimates that 
veer greatly from the desired level set at certain places, since the symmetric 
difference is a global measure of average closeness between two sets. Some 
applications may need a more local or spatially uniform error measure as 
provided by the Hausdorff metric, for example, to preserve topological prop- 
erties of the level set as in clustering [6, 8, 21] or ensure robustness to outliers 
in level set-based anomaly detection [16, 20, 24] and data ranking [11]. The 
Hausdorff error metric is defined as follows between two nonempty sets:

$$d_\infty(G_1, G_2) = \max\Big\{ \sup_{x \in G_2} \rho(x, G_1),\ \sup_{x \in G_1} \rho(x, G_2) \Big\},$$

where $\rho(x, G) = \inf_{y \in G} \|x - y\|$ is the smallest Euclidean distance of a point
in $G$ to the point $x$. If $G_1$ or $G_2$ is empty, then let $d_\infty(G_1, G_2)$ be defined
as the largest distance between any two points in the domain. Control of
this error measure provides a uniform mode of convergence, as it implies 
control of the deviation of a single point from the desired set. A symmetric 
set difference-based estimator may not provide such a uniform control as it is 
easy to see that a set estimate can have a very small measure of symmetric 
difference error but large Hausdorff error. Conversely, as long as the set 
boundary is not space filling and the domain is bounded, small Hausdorff 
error implies small symmetric-difference measure. 
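
To make the contrast between the two error measures concrete, the following minimal sketch computes the Hausdorff distance between finite point sets and applies it to a toy example. It is an illustration only: sets are represented by the centers of grid cells (an assumption made here for simplicity, so the value approximates the distance between the unions of cells), and the names `rho`, `hausdorff` and `grid_cells` are hypothetical.

```python
import math


def rho(x, G):
    """Smallest Euclidean distance from the point x to the finite point set G."""
    return min(math.dist(x, y) for y in G)


def hausdorff(G1, G2, diameter):
    """Hausdorff distance between two finite point sets.

    Following the convention in the text, if either set is empty the
    distance is set to `diameter`, the largest distance in the domain.
    """
    if not G1 or not G2:
        return diameter
    return max(max(rho(x, G1) for x in G2), max(rho(x, G2) for x in G1))


def grid_cells(k, keep):
    """Centers of the cells of a k-by-k grid on [0,1]^2 selected by `keep`."""
    h = 1.0 / k
    return [((a + 0.5) * h, (b + 0.5) * h)
            for a in range(k) for b in range(k) if keep(a, b)]


k = 20
# Target set: left half of the unit square.
target = grid_cells(k, lambda a, b: a < 10)
# Estimate: the same set plus one spurious cell in the far corner.
estimate = grid_cells(k, lambda a, b: a < 10 or (a == k - 1 and b == k - 1))

# Measure of the symmetric difference: the area of the single spurious cell.
sym_diff_measure = len(set(estimate) ^ set(target)) / k ** 2
# Hausdorff error: driven entirely by the distance of the spurious cell.
d_inf = hausdorff(estimate, target, diameter=math.sqrt(2))
print(sym_diff_measure, d_inf)
```

In this toy example a single misplaced cell contributes only the area of one cell to the symmetric difference, while the Hausdorff error is about half the width of the domain.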

Existing results pertaining to nonparametric level set estimation using 
the Hausdorff metric [2, 9, 23] focus on rather restrictive classes of level sets
(e.g., the boundary fragment and star-shaped set classes). These restrictions,
which effectively reduce the set estimation problem to a boundary function
estimation problem (in rectangular or polar coordinates, resp.), are typically
not met in practical applications. In particular, the characterization
of level set estimation as a boundary function estimation problem requires 
prior knowledge of a reference coordinate or interior point (in rectangular 
or polar coordinates, resp.) and precludes level sets with multiple connected 
components. Moreover, the estimation techniques proposed in [2, 9, 23]
require precise knowledge of the regularity of the density (quantified by the
parameter $\alpha$, to be defined below) in the vicinity of the desired level in order
to achieve minimax optimal rates of convergence. Such prior knowledge is
unavailable in most practical applications. Recently, a plug-in method based
on sup-norm density estimation was put forth in [3] that can handle more
general classes than boundary fragments or star-shaped sets. However,
sup-norm density estimation requires the density to satisfy global smoothness
assumptions. Also, the method only deals with a special case of the density
regularity condition considered in this paper ($\alpha = 1$) and is therefore not
adaptive to unknown density regularity.

In this paper, we propose a plug-in procedure based on a regular histogram 
partition that can adaptively achieve minimax optimal rates of Hausdorff er- 
ror convergence over a broad class of level sets with very general shapes and 
multiple connected components, without assuming a priori knowledge of the 
density regularity parameter $\alpha$. Adaptivity is achieved by a new data-driven
procedure for selecting the histogram resolution. The procedure bears some 
similarity to Lepski-type methods [10], as further discussed in Section 3.2. 
However, our procedure is specifically designed for the level set estimation 
problem and only requires local regularity of the density in the vicinity of 
the desired level. A shorter version of this paper appeared in [17]; however, 
it relies on more stringent assumptions on the class of level sets under con- 
sideration. In this paper, we generalize the class of level sets to allow for 
spatial variations in the density regularity along the level set boundary, and 
we also discuss extensions to support set estimation and discontinuity in the 
density at all points around the level of interest. 

The paper is organized as follows. Section 2 states our basic assumptions 
which allow Hausdorff accurate level set estimation and presents a minimax 
lower bound on the Hausdorff performance of any level set estimator for 
the class of densities under consideration. In Section 3, we present the pro- 
posed histogram-based approach to Hausdorff accurate level set estimation. 
In Section 3.1, we show that the proposed estimator can achieve the mini- 
max optimal rate of convergence given knowledge of the density regularity 
parameter $\alpha$, and Section 3.2 extends the estimator to achieve adaptivity
to unknown density regularity. We also comment on extensions that address 
discontinuity in the density at the level of interest and support set estima- 
tion. Concluding remarks are given in Section 4 and the Appendices contain 
proofs of the main results. 

2. Density assumptions. We assume that the domain of the density $f$ is
the unit hypercube in $d$ dimensions, that is, $\mathcal{X} = [0, 1]^d$. Extensions to other
compact domains are straightforward. Furthermore, the density is assumed
to be bounded with range $[0, f_{\max}]$, though we do not assume knowledge of
$f_{\max}$. Controlling the Hausdorff accuracy of level set estimates requires some
smoothness assumptions on the density and the level set boundary, which
are stated below. Before that, we introduce the following definitions:

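• $\epsilon$-ball: An $\epsilon$-ball centered at a point $x \in \mathcal{X}$ is defined as

$$B(x, \epsilon) = \{y \in \mathcal{X} : \|x - y\| \le \epsilon\}.$$

Here $\|\cdot\|$ denotes the Euclidean distance.

• Inner $\epsilon$-cover: An inner $\epsilon$-cover of a set $G \subseteq \mathcal{X}$ is defined as the union of
all $\epsilon$-balls contained in $G$. Formally,

$$I_\epsilon(G) = \bigcup_{x \,:\, B(x, \epsilon) \subseteq G} B(x, \epsilon).$$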

We are now ready to state the assumptions. The first one characterizes 
the relationship between distances and changes in density, and the second 
one is a topological assumption on the level set boundary that essentially 
generalizes the notion of Lipschitz functions to closed hypersurfaces. 

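[A] Local density regularity. The density is $\alpha$-regular around the $\gamma$-level set,
$0 < \alpha < \infty$ and $0 < \gamma < f_{\max}$, if:

[A1] there exist constants $C_1, \delta_1 > 0$ such that for all $x \in \mathcal{X}$ with
$|f(x) - \gamma| \le \delta_1$,

$$|f(x) - \gamma| \ge C_1\, \rho(x, \partial G^*_\gamma)^{\alpha},$$

where $\partial G^*_\gamma$ denotes the boundary of the true level set $G^*_\gamma$.
[A2] there exist constants $C_2, \delta_2 > 0$ and $x_0 \in \partial G^*_\gamma$ such that for all
$x \in B(x_0, \delta_2)$,

$$|f(x) - \gamma| \le C_2\, \rho(x, \partial G^*_\gamma)^{\alpha}.$$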

This condition characterizes the behavior of the density around the level
$\gamma$. Assumption [A1] states that the density cannot be arbitrarily "flat"
around the level, and changes as at least the $\alpha$th power of the distance
from the level set boundary. Assumption [A2] states that there exists
a fixed neighborhood around some point on the boundary where the
density changes no faster than the $\alpha$th power of the distance from the
level set boundary. The latter condition is only required for adaptivity, as
we discuss later. The regularity parameter $\alpha$ determines the rate of error
convergence for level set estimation. Accurate estimation is more difficult
at levels where the density is relatively flat (large $\alpha$), as intuition would
suggest. It is important to point out that in this paper we do not assume
knowledge of $\alpha$, unlike previous investigations into Hausdorff accurate
level set estimation [2, 3, 9, 23]. Therefore, here the assumption simply
states that there is a relationship between distance and density level,
but the precise nature of the relationship is unknown. In Section 3, we
briefly discuss extensions to address the case $\alpha = 0$, which corresponds to
discontinuity in the density at all points around the level set boundary,
and the case $\gamma = 0$, which corresponds to support set estimation.
[B] Level set regularity. There exist constants $\epsilon_o > 0$ and $C_3 > 0$ such that
for all $\epsilon \le \epsilon_o$, $I_\epsilon(G^*_\gamma) \ne \emptyset$, and for all $x \in \partial G^*_\gamma$, $\rho(x, I_\epsilon(G^*_\gamma)) \le C_3 \epsilon$. This
assumption implies that the level set is not arbitrarily narrow anywhere.
It precludes space-filling boundaries and features like cusps, arbitrarily 
thin ribbons and isolated connected components of arbitrarily small size. 
This condition is necessary since arbitrarily small features cannot be 
detected and resolved from a finite sample. 

For a fixed set of positive numbers $C_1, C_2, C_3, \epsilon_o, \delta_1, \delta_2, f_{\max}, \gamma < f_{\max}, d$
and $\alpha$, we consider the following classes of densities.

Definition 1. $\mathcal{F}^*_1(\alpha)$ denotes the class of densities satisfying
assumptions [A1] and [B].

Definition 2. $\mathcal{F}^*_2(\alpha)$ denotes the class of densities satisfying
assumptions [A1], [A2] and [B].

The dependence on other parameters is omitted as these do not influence
the minimax optimal rate of convergence (except for the dimension $d$). In
the paper, we present a method that provides minimax optimal rates of
convergence for the class $\mathcal{F}^*_1(\alpha)$, given knowledge of the density regularity
parameter $\alpha$. We also extend the method to achieve adaptivity to $\alpha$ for the
class $\mathcal{F}^*_2(\alpha)$, while preserving the minimax optimal performance.

Assumption [A] is similar to the one employed in [2, 23], except that the 
upper bound assumption on the density deviation in [2, 23] holds provided 
that the set $\{x : |f(x) - \gamma| \le \delta_1\}$ is nonempty. This implies that the densities
either jump across the level $\gamma$ at any point on the level set boundary (i.e.,
the deviation is greater than $\delta_1$) or change exactly as the $\alpha$th power of the
distance from the boundary. Our formulation allows for densities with
regularities that vary spatially along the level set boundary: it requires that
the density changes no slower than the $\alpha$th power of the distance from the
boundary, except in a fixed neighborhood of one point where the density
changes exactly as the $\alpha$th power of the distance from the boundary. While
the formulation in [2, 23] requires the upper bound on the density deviation
to hold for at least one point on the boundary, our assumption [A2] requires
the upper bound to hold for a fixed neighborhood around at least one point
on the boundary. This is necessary for adaptivity since a procedure cannot
sense the regularity as characterized by $\alpha$ if the regularity only holds
in an arbitrarily small region. Assumption [B] implies that the boundary 
looks locally like a Lipschitz function and allows for level sets with multiple 
connected components and arbitrary locations. Thus, these restrictions are 
quite mild and less restrictive than those considered in the previous litera- 
ture on Hausdorff accurate level set estimation. In fact, assumption [B] is 
satisfied by a Lipschitz boundary fragment or star-shaped set as considered 
in [2, 9, 23], as the following lemma states; please refer to [18] for a formal 
proof. 

Lemma 1. Consider the $\gamma$-level set $G^*_\gamma$ of a density $f \in \mathcal{F}_{SL}(\alpha)$, where
$\mathcal{F}_{SL}(\alpha)$ denotes the class of $\alpha$-regular densities with Lipschitz star-shaped
level sets as defined in [23]. Then, $G^*_\gamma$ satisfies the level set regularity
assumption [B].

In Theorem 4 of [23], Tsybakov establishes a minimax lower bound of
$(n/\log n)^{-1/(d+2\alpha)}$ for the class of Lipschitz star-shaped sets, which, per
Lemma 1, also satisfy assumption [B]. His proof uses Fano's lemma to derive
the lower bound for a discrete subset of densities from this class. It is easy to
see that the discrete subset of densities used in his construction also satisfies
our form of assumption [A]. Hence, the same minimax lower bound holds
for the classes $\mathcal{F}^*_1(\alpha)$ and $\mathcal{F}^*_2(\alpha)$ under consideration as well, and we have
the following proposition. Here $\mathbb{E}$ denotes expectation with respect to the
random data sample.

Proposition 1. There exists $c > 0$ such that, for large enough $n$,

$$\inf_{\widehat{G}_n} \sup_{f \in \mathcal{F}^*_1(\alpha)} \mathbb{E}[d_\infty(\widehat{G}_n, G^*_\gamma)] \ge \inf_{\widehat{G}_n} \sup_{f \in \mathcal{F}^*_2(\alpha)} \mathbb{E}[d_\infty(\widehat{G}_n, G^*_\gamma)] \ge c \Big(\frac{n}{\log n}\Big)^{-1/(d+2\alpha)}.$$

The inf is taken over all set estimators $\widehat{G}_n$ based on the $n$ observations.

3. Hausdorff accurate level set estimation using histograms. Direct Haus- 
dorff estimation is challenging as there exists no natural empirical measure 
that can be used to gauge the Hausdorff error of an estimate. However, 
the density regularity assumption [A] suggests that Hausdorff control over 
the level set estimate can be obtained indirectly by controlling the density 
deviation error rather than the distance deviation. Thus, we propose a plug- 
in level set estimator that is based on an empirical density estimator, the 
regular histogram. 

Let $\mathcal{A}_j$ denote the collection of cells in a regular partition of $[0, 1]^d$ into
hypercubes of dyadic sidelength $2^{-j}$, where $j$ is a nonnegative integer. The
level set estimate at this resolution is given as

$$\widehat{G}_j = \bigcup_{A \in \mathcal{A}_j \,:\, \widehat{f}(A) \ge \gamma} A. \tag{1}$$

Here $\widehat{f}(A) = \widehat{P}(A)/\mu(A)$, where $\widehat{P}(A) = \frac{1}{n}\sum_{i=1}^{n} \mathbf{1}\{X_i \in A\}$ denotes the empirical
probability of an observation occurring in $A$, and $\mu$ is the Lebesgue
measure.
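
As an illustration of the estimator in (1), the sketch below builds the regular histogram at a given resolution and returns the cells whose histogram value reaches the level. It is a minimal sketch under stated assumptions: samples are tuples in $[0,1]^d$, the level is strictly positive, cells are identified by integer index tuples, and the function name `histogram_level_set` and the toy density are hypothetical.

```python
import random
from collections import Counter


def histogram_level_set(samples, j, gamma):
    """Cells of the dyadic partition with sidelength 2**-j whose histogram
    density estimate is at least gamma, returned as integer index tuples."""
    n = len(samples)
    d = len(samples[0])
    m = 2 ** j                       # number of cells along each axis
    mu = 2.0 ** (-j * d)             # Lebesgue measure of one cell
    counts = Counter(
        tuple(min(int(c * m), m - 1) for c in x)   # clamp c = 1.0 into the last cell
        for x in samples
    )
    # Empty cells have estimate 0 < gamma (gamma > 0 is assumed), so only
    # occupied cells can belong to the estimate.
    return {cell for cell, k in counts.items() if k / (n * mu) >= gamma}


# Hypothetical 1-D example: density 1.6 on [0, 0.5) and 0.4 on [0.5, 1),
# so the level set at gamma = 1 is [0, 0.5).
rng = random.Random(0)
samples = [
    (0.5 * rng.random(),) if rng.random() < 0.8 else (0.5 + 0.5 * rng.random(),)
    for _ in range(4000)
]
estimate = histogram_level_set(samples, j=2, gamma=1.0)
print(sorted(estimate))
```

The resolution `j` is passed in by hand here; how it should be chosen is the subject of the remainder of this section.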

3.1. A priori knowledge of local density regularity. The appropriate
resolution for accurate level set estimation depends on the density regularity,
as characterized by $\alpha$, near the level of interest. If the density varies sharply
near the level of interest (small $\alpha$), then accurate estimation is easier and a
fine resolution suffices. Identifying the level set is more difficult if the
density is very flat (large $\alpha$) and, hence, a lower resolution (more averaging) is
required. Our first result shows that if the local density regularity
parameter $\alpha$ is known, then the correct resolution for Hausdorff accurate level set
estimation can be chosen (as in [2, 23]), and the corresponding estimator of
(1) achieves a near minimax optimal rate over the class of densities given by
$\mathcal{F}^*_1(\alpha)$. Notice that even though the proposed method is a plug-in level set
estimator based on a histogram density estimate, the histogram resolution
is chosen to specifically target the level set problem and is not optimized for
density estimation. Thus, we do not require that the density exhibits some
smoothness at all points in the domain. We introduce the notation $a_n \asymp b_n$
to denote $a_n = O(b_n)$ and $b_n = O(a_n)$.

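Theorem 1. Assume that the local density regularity $\alpha$ is known. Pick
resolution $j = j(n)$ such that $2^{-j} \asymp s_n (n/\log n)^{-1/(d+2\alpha)}$, where $s_n$ is a
monotone diverging sequence. Then,

$$\sup_{f \in \mathcal{F}^*_1(\alpha)} \mathbb{E}[d_\infty(\widehat{G}_j, G^*_\gamma)] \le C s_n \Big(\frac{n}{\log n}\Big)^{-1/(d+2\alpha)}$$

for all $n$, where $C = C(C_1, C_3, \epsilon_o, f_{\max}, d, \alpha) > 0$ is a constant.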

The proof is given in Appendix A and relies on two key facts. First, the 
density regularity assumption [A1] implies that the distance of any point in
the level set estimate is controlled by its deviation from the level of interest
$\gamma$. Therefore, with high probability, only the cells near the boundary are
erroneously included or excluded in the level set estimate. Second, the level 
set boundary does not have very narrow features — features that cannot be 
detected by a finite sample — and is locally Lipschitz as per assumption [B]. 
This implies that the erroneous cells are not too far from the nonerroneous
cells. Using these arguments, it is shown that the Hausdorff error scales as 
the histogram cell sidelength. 

Theorem 1 provides an upper bound on the Hausdorff error of our
estimate. If $s_n$ is slowly diverging, for example, if $s_n = (\log n)^{\epsilon}$ where $\epsilon > 0$, this
upper bound agrees with the minimax lower bound of Proposition 1 up to a
$(\log n)^{\epsilon}$ factor. Hence, the proposed estimator can achieve near minimax
optimal rates, given knowledge of the density regularity. We would like to point
out that if the parameter $\delta_1$ characterizing assumption [A] and the density
bound $f_{\max}$ are also known, then the appropriate resolution can be chosen
as $j = \lfloor \log_2 (c^{-1}(n/\log n)^{1/(d+2\alpha)}) \rfloor$, where the constant $c = c(\delta_1, f_{\max})$. With
this choice, the optimal sidelength scales as $2^{-j} \asymp (n/\log n)^{-1/(d+2\alpha)}$, and
the estimator $\widehat{G}_j$ exactly achieves the minimax optimal rate.
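
The resolution rule above can be evaluated numerically. The sketch below is illustrative only: the text does not give the form of the constant $c(\delta_1, f_{\max})$, so `c = 1.0` is a placeholder assumption, the clipping at resolution 0 is added for small samples, and the function name `known_alpha_resolution` is hypothetical.

```python
import math


def known_alpha_resolution(n, d, alpha, c=1.0):
    """Resolution j = floor(log2(c^{-1} (n / log n)^{1/(d + 2 alpha)})).

    `c` stands in for the constant c(delta_1, f_max) of the text; its
    actual value is not specified there, so 1.0 is only a placeholder.
    The result is clipped at 0 so that it is a valid dyadic resolution.
    """
    rate = (n / math.log(n)) ** (1.0 / (d + 2.0 * alpha))
    return max(0, math.floor(math.log2(rate / c)))


# A flatter density (larger alpha) leads to a coarser partition.
for alpha in (0.5, 1.0, 2.0, 4.0):
    print(alpha, known_alpha_resolution(10 ** 6, d=2, alpha=alpha))
```

The loop shows the qualitative behavior described in Section 3.1: as the density becomes flatter near the level, the selected cells become larger so that more observations are averaged per cell.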

Remark 1. A dyadic sidelength is not necessary for Theorem 1 to hold;
however, the adaptive procedure described below is based on a search over
dyadic resolutions. Thus, to present a unified analysis, we consider a dyadic 
sidelength here as well. 

3.2. Adapting to unknown local density regularity. In this section, we 
present a procedure that automatically selects the appropriate resolution 
in a purely data-driven way without assuming prior knowledge of $\alpha$. The
proposed procedure is a complexity regularization approach that is reminis- 
cent of Lepski-type methods for function estimation [10], which are spatially 
adaptive bandwidth selectors. In Lepski methods, the appropriate bandwidth
at a point is determined as the largest bandwidth for which the estimate
does not deviate significantly from estimates generated at finer resolutions.
Our procedure is similar in spirit; however, it is tailored specifically
for the level set problem; hence, the chosen resolution at any point depends 
only on the local regularity of the density around the level of interest. 

The histogram resolution search is focused on regular partitions of dyadic
sidelength $2^{-j}$, $j \in \{0, 1, \dots, J\}$. The choice of $J$ will be specified below.
Since the selected resolution needs to be adapted to the local regularity of
the density around the level of interest, we introduce the following vernier:

$$\mathcal{V}_{\gamma,j} = \min_{A \in \mathcal{A}_j} \max_{A' \in \mathcal{A}_{j'} \cap A} |\gamma - \bar{f}(A')|.$$

Here $\bar{f}(A) = P(A)/\mu(A)$, $j' = \lfloor j + \log_2 s_n \rfloor$, where $s_n$ is a slowly diverging
monotone sequence, for example, $\log n$, $\log\log n$, etc., and $\mathcal{A}_{j'} \cap A$ denotes
the collection of subcells with sidelength $2^{-j'} \in [2^{-j}/s_n, 2^{-j+1}/s_n)$ within
the cell $A$. Observe that the vernier value is determined by a cell $A \in \mathcal{A}_j$
that intersects the boundary $\partial G^*_\gamma$. By evaluating the deviation in average
density from level $\gamma$ within subcells of $A$, the vernier indicates whether or
not the density in cell $A$ is uniformly close to $\gamma$. Thus, the vernier is sensitive
to the local density regularity in the vicinity of the desired level and leads
to selection of the appropriate resolution adapted to the unknown density
regularity parameter $\alpha$, as we will show in Theorem 2.

Since $\mathcal{V}_{\gamma,j}$ requires knowledge of the unknown probability measure, we
must work with the empirical version, defined analogously as

$$\widehat{\mathcal{V}}_{\gamma,j} = \min_{A \in \mathcal{A}_j} \max_{A' \in \mathcal{A}_{j'} \cap A} |\gamma - \widehat{f}(A')|.$$

The empirical vernier $\widehat{\mathcal{V}}_{\gamma,j}$ is balanced by a penalty term

$$\Psi_{j'} := \max_{A \in \mathcal{A}_{j'}} \sqrt{8\, \frac{\log(2^{j'(d+1)} 16/\delta)}{n\mu(A)}\, \max\Big(\widehat{f}(A),\ 8\, \frac{\log(2^{j'(d+1)} 16/\delta)}{n\mu(A)}\Big)},$$

where $0 < \delta < 1$ is a confidence parameter, and $\mu(A) = 2^{-j'd}$. Notice that
the penalty is computable from the given observations. The precise form of
$\Psi$ is chosen to bound the deviation between true and empirical vernier with
high probability (refer to Corollary B.1 for a formal proof). The final level
set estimate is given by

$$\widehat{G} = \widehat{G}_{\widehat{j}}, \tag{2}$$

where

$$\widehat{j} = \arg\min_{0 \le j \le J} \big\{ \widehat{\mathcal{V}}_{\gamma,j} + \Psi_{j'} \big\}. \tag{3}$$

Observe that the value of the vernier decreases with increasing resolution as
better approximations to the true level are available. On the other hand, the
penalty is designed to increase with resolution to penalize high complexity
estimates that might overfit the given sample of data. Thus, the above
procedure chooses the appropriate resolution automatically by balancing these
two terms. The following theorem characterizes the performance of the
proposed complexity penalized procedure.
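
The selection rule (3) can be sketched in a few lines. The code below is an illustrative implementation under stated assumptions, not the authors' code: samples are tuples in $[0,1]^d$, $s_n \ge 1$ so that $j' \ge j$, and the helper names `cell_counts`, `penalty`, `empirical_vernier` and `select_resolution` are hypothetical. The values of `J`, `s_n` and `delta` are left to the caller.

```python
import math
from collections import Counter


def cell_counts(samples, j):
    """Count samples per cell of the dyadic partition with sidelength 2**-j."""
    m = 2 ** j
    return Counter(tuple(min(int(c * m), m - 1) for c in x) for x in samples)


def penalty(counts, n, d, jp, delta):
    """Penalty Psi_{j'} computed from the cell counts at resolution j'."""
    mu = 2.0 ** (-jp * d)
    # log(2^{j'(d+1)} 16 / delta), expanded to avoid forming large powers.
    log_term = jp * (d + 1) * math.log(2.0) + math.log(16.0 / delta)
    t = log_term / (n * mu)
    fhat_max = max(counts.values()) / (n * mu)
    # sqrt(8 t max(fhat, 8 t)) is increasing in fhat, so the maximum over
    # cells is attained at the cell with the largest histogram value.
    return math.sqrt(8.0 * t * max(fhat_max, 8.0 * t))


def empirical_vernier(counts, n, d, j, jp, gamma):
    """min over cells A of A_j of max over subcells A' of |gamma - fhat(A')|."""
    mu = 2.0 ** (-jp * d)
    shift = jp - j
    per_parent = 2 ** (shift * d)          # subcells of resolution j' per cell of A_j
    worst, seen = {}, Counter()
    for cell, k in counts.items():
        parent = tuple(i >> shift for i in cell)
        dev = abs(gamma - k / (n * mu))
        worst[parent] = max(worst.get(parent, 0.0), dev)
        seen[parent] += 1
    # An empty subcell has fhat = 0 and therefore deviation gamma.
    for parent in worst:
        if seen[parent] < per_parent:
            worst[parent] = max(worst[parent], gamma)
    candidates = list(worst.values())
    if len(worst) < 2 ** (j * d):          # some cell of A_j holds no sample at all
        candidates.append(gamma)
    return min(candidates)


def select_resolution(samples, gamma, J, s_n, delta):
    """Resolution minimizing empirical vernier plus penalty over 0 <= j <= J."""
    n, d = len(samples), len(samples[0])
    best_j, best_score = 0, math.inf
    for j in range(J + 1):
        jp = math.floor(j + math.log2(s_n))
        counts = cell_counts(samples, jp)
        score = (empirical_vernier(counts, n, d, j, jp, gamma)
                 + penalty(counts, n, d, jp, delta))
        if score < best_score:
            best_j, best_score = j, score
    return best_j
```

The sketch recomputes the histogram at each candidate resolution for clarity; since the partitions are nested, the counts at coarser resolutions could instead be aggregated from the finest one.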

Theorem 2. Pick $J = J(n)$ such that $2^{-J} \asymp s_n (n/\log n)^{-1/d}$, where $s_n$
is a monotone diverging sequence. Let $\widehat{j}$ denote the resolution chosen by the
complexity penalized method as given by (3) and $\widehat{G}$ denote the final estimate
of (2). Then, with probability at least $1 - 2/n$, for all densities in the class
$\mathcal{F}^*_2(\alpha)$,

$$c_1 s_n^{d/(d+2\alpha)} \Big(\frac{n}{\log n}\Big)^{-1/(d+2\alpha)} \le 2^{-\widehat{j}} \le c_2 s_n s_n^{d/(d+2\alpha)} \Big(\frac{n}{\log n}\Big)^{-1/(d+2\alpha)}$$

for $n$ large enough [so that $s_n > c(C_3, \epsilon_o, d)$], where $c_1, c_2 > 0$ are constants.
In addition,

$$\sup_{f \in \mathcal{F}^*_2(\alpha)} \mathbb{E}[d_\infty(\widehat{G}, G^*_\gamma)] \le C s_n^2 \Big(\frac{n}{\log n}\Big)^{-1/(d+2\alpha)}$$

for all $n$, where $C = C(C_1, C_2, C_3, \epsilon_o, f_{\max}, \delta_1, d, \alpha) > 0$ is a constant.

The proof is given in Appendix B. Observe that the maximum resolution
$2^{J} \asymp s_n^{-1}(n/\log n)^{1/d}$ depends only on $n$ and allows the optimal resolution
for any $\alpha$ to lie in the search space. By appropriate choice of $s_n$, for example,
$s_n = (\log n)^{\epsilon/2}$ with $\epsilon$ a small number $> 0$, the bound of Theorem 2 matches
the minimax lower bound of Proposition 1, except for an additional $(\log n)^{\epsilon}$
factor. Hence, our method adaptively achieves near minimax optimal rates
of convergence for the class $\mathcal{F}^*_2(\alpha)$.

Remark 2. The case $\alpha = 0$ corresponds to a jump in the density across
the level $\gamma$, at all points along the level set boundary. The adaptive
estimator can be extended to handle the complete range $0 \le \alpha < \infty$ by a slight
modification of the vernier

$$\mathcal{V}_{\gamma,j} = 2^{-j'/2} \min_{A \in \mathcal{A}_j} \max_{A' \in \mathcal{A}_{j'} \cap A} |\gamma - \bar{f}(A')|.$$

This makes the vernier sensitive to the resolution even for the jump case
and biases a vernier minimizer toward finer resolutions. The exact form
of the modification arises from technical considerations and is somewhat
nonintuitive. Hence, we omitted the jump case in our earlier analysis to
keep the presentation simple. The penalty also needs to be scaled by a
factor of $2^{-j'/2}$, to ensure that balancing the vernier and penalty leads to
the appropriate resolution for the whole range of the regularity parameter
$0 \le \alpha < \infty$. Please refer to [18] for a detailed proof.

Remark 3. Under a measure of the symmetric difference error, it is
known that support set estimation, that is, learning the set $G^*_0 := \{x : f(x) > 0\}$,
is easier than level set estimation, except for the case $\alpha = 0$ (see [7, 23]).
The same holds for Hausdorff error and the minimax rate of convergence
can be shown to be $(n/\log n)^{-1/(d+\alpha)}$ [18]. The minimax lower bound
follows along the lines of the minimax lower bound in [23] for level set
estimation ($\gamma > 0$). This rate can be achieved by the following plug-in histogram
estimator:

$$\widehat{G}_{0,j} = \bigcup_{A \in \mathcal{A}_j \,:\, \widehat{f}(A) > 0} A.$$

This requires a modified theoretical analysis using Bernstein
inequalities rather than the relative VC inequalities we use in the proofs of
Theorems 1 and 2 for level set estimation. Formal proofs for support set
estimation are given in [18].

4. Conclusions. In this paper, we developed a Hausdorff accurate level
set estimation method that is adaptive to unknown density regularity and 
achieves nearly minimax optimal rates of error convergence over a more 
general class of level sets than considered in previous literature. The vernier 
provides the key to achieve adaptivity while requiring only local regularity of 
the density in the vicinity of the desired level. We also discussed extensions 
of the proposed estimator to address discontinuity in the density around the 
level of interest and support set estimation. 

While this paper considers level sets with locally Lipschitz boundaries, 
extensions to additional boundary smoothness (e.g., Hölder regularity $> 1$)
may be possible in the proposed framework using techniques such as
wedgelets [5] or curvelets [1]. The earlier work on Hausdorff accurate level 
set estimation [2, 9, 23] does address higher smoothness of the boundary, 
but that follows as a straightforward consequence of assuming a functional 
form for the boundary. Also, we have only addressed the density level set 
problem in this paper. Extensions to general regression level set estimation 
should be possible using a similar approach. 

The results of this paper indicate that a regular, spatially nonadaptive 
partition suffices for minimax optimal Hausdorff accurate level set estima- 
tion. However, in practice, a spatially adapted partition can provide better 
performance than a uniform partition. This is because nonuniform partitions 
can adapt to the spatial variations in density regularity to yield a better
estimate of the boundary where the density changes sharply, even though the
Hausdorff error is dominated by the accuracy in regions where the density 
is relatively flat at the level of interest. Thus, it is of interest to develop 
spatially adapted estimators. This might be possible by developing a tree- 
based approach or a modified Lepski method, and it is the subject of current 
research. 

APPENDIX A: PROOF OF THEOREM 1 

Before proceeding to the proof of Theorem 1, we establish three lemmas 
that will be used both in this proof and in the proof of Theorem 2. The first 
lemma bounds the deviation of true and empirical density averages. The 
choice of penalty used to achieve adaptivity is motivated by this relation. 

Lemma A.1. Consider $0 < \delta < 1$. With probability at least $1 - \delta$, the
following is true for all $j \ge 0$:

$$\max_{A \in \mathcal{A}_j} |\bar{f}(A) - \widehat{f}(A)| \le \Psi_j.$$

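Proof. The proof relies on a pair of VC inequalities (see [4], Chapter
3) that bound the relative deviation of true and empirical probabilities. For
the collection $\mathcal{A}_j$ with cardinality $2^{jd}$, the relative VC inequalities imply
that for any $\epsilon > 0$, with probability $\ge 1 - 8 \cdot 2^{jd} e^{-n\epsilon^2/4}$, $\forall A \in \mathcal{A}_j$ both

$$P(A) - \widehat{P}(A) \le \epsilon \sqrt{P(A)} \quad \text{and} \quad \widehat{P}(A) - P(A) \le \epsilon \sqrt{\widehat{P}(A)}.$$

Also, observe that

$$\widehat{P}(A) \le P(A) + \epsilon \sqrt{\widehat{P}(A)} \implies \widehat{P}(A) \le 2\max(P(A), 2\epsilon^2) \tag{4}$$

and

$$P(A) \le \widehat{P}(A) + \epsilon \sqrt{P(A)} \implies P(A) \le 2\max(\widehat{P}(A), 2\epsilon^2). \tag{5}$$

To understand statement (4), consider the following two cases: (i) If $\widehat{P}(A) \le 4\epsilon^2$,
the statement is obvious; (ii) if $\widehat{P}(A) > 4\epsilon^2$, this gives a bound on $\epsilon$
which implies $\widehat{P}(A) \le P(A) + \widehat{P}(A)/2 \implies \widehat{P}(A) \le 2P(A)$. Statement (5)
follows similarly. Therefore, using (5) we get, with probability $\ge 1 - 8 \cdot 2^{jd} e^{-n\epsilon^2/4}$,
$\forall A \in \mathcal{A}_j$,

$$|P(A) - \widehat{P}(A)| \le \epsilon \sqrt{2\max(\widehat{P}(A), 2\epsilon^2)}.$$

Setting $\epsilon = \sqrt{4\log(2^{jd} 8/\delta_j)/n}$, $\delta_j = \delta 2^{-(j+1)}$ and applying the union bound, we
have with probability $\ge 1 - \delta$, for all $j \ge 0$ and all cells $A \in \mathcal{A}_j$,

$$|P(A) - \widehat{P}(A)| \le \sqrt{8\, \frac{\log(2^{j(d+1)} 16/\delta)}{n}\, \max\Big(\widehat{P}(A),\ 8\, \frac{\log(2^{j(d+1)} 16/\delta)}{n}\Big)}.$$

The result follows by dividing both sides by $\mu(A)$. □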
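
The last step of the proof compresses several substitutions. As a reading aid, the following worked computation spells them out using only the quantities $\epsilon$, $\delta_j$ and $\delta$ defined in the proof; the hat on $\widehat{P}(A)$ marks the empirical probability, which is the reading under which the penalty is computable from the data.

```latex
% Failure probability at resolution j when \epsilon^2 = 4\log(2^{jd} 8/\delta_j)/n,
% and the union bound over all resolutions j >= 0:
8 \cdot 2^{jd} e^{-n\epsilon^2/4}
  = 8 \cdot 2^{jd} \cdot \frac{\delta_j}{2^{jd}\, 8} = \delta_j,
\qquad
\sum_{j \ge 0} \delta_j = \delta \sum_{j \ge 0} 2^{-(j+1)} = \delta .

% Argument of the logarithm when \delta_j = \delta 2^{-(j+1)}:
\frac{2^{jd}\, 8}{\delta_j} = \frac{2^{jd}\, 8 \cdot 2^{j+1}}{\delta}
  = \frac{2^{j(d+1)}\, 16}{\delta},
\qquad
\epsilon^2 = \frac{4 \log(2^{j(d+1)} 16/\delta)}{n} .

% Substituting \epsilon^2 into the deviation bound:
\epsilon \sqrt{2 \max(\widehat{P}(A), 2\epsilon^2)}
  = \sqrt{2\epsilon^2 \max(\widehat{P}(A), 2\epsilon^2)}
  = \sqrt{8\, \frac{\log(2^{j(d+1)} 16/\delta)}{n}\,
      \max\!\Big(\widehat{P}(A),\ 8\, \frac{\log(2^{j(d+1)} 16/\delta)}{n}\Big)} .
```

Dividing the final expression by $\mu(A)$ moves one factor of $1/\mu(A)$ into each of the two terms under the square root, which gives the form of the penalty $\Psi_j$.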

The next lemma states how the density deviation bound or penalty $\Psi_j$
scales with resolution $j$ and number of observations $n$.

Lemma A.2. There exist constants $c_3, c_4 = c_4(f_{\max}, d) > 0$ such that if
$j = j(n)$ satisfies $2^{j} = O((n/\log n)^{1/d})$, then for all $n$, with probability at
least $1 - 1/n$,

$$c_3 \sqrt{2^{jd}\, \frac{\log n}{n}} \le \Psi_j \le c_4 \sqrt{2^{jd}\, \frac{\log n}{n}}.$$

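Proof. We first derive the lower bound. Observe that since the total
empirical probability mass is 1, we have

$$1 = \sum_{A \in \mathcal{A}_j} \widehat{P}(A) \le \max_{A \in \mathcal{A}_j} \widehat{P}(A) \times |\mathcal{A}_j| = \max_{A \in \mathcal{A}_j} \frac{\widehat{P}(A)}{\mu(A)} = \max_{A \in \mathcal{A}_j} \widehat{f}(A).$$

Using this along with $\delta = 1/n$, $j \ge 0$ and $\mu(A) = 2^{-jd}$, we get

$$\Psi_j \ge \sqrt{2^{jd}\, \frac{8 \log 16n}{n}}.$$

To get the upper bound, using statement (4) from the proof of Lemma
A.1, we have, with probability $\ge 1 - 8 \cdot 2^{jd} e^{-n\epsilon^2/4}$, for all $A \in \mathcal{A}_j$,
$\widehat{P}(A) \le 2\max(P(A), 2\epsilon^2)$. Setting $\epsilon = \sqrt{4\log(2^{jd} 8/\delta_j)/n}$, $\delta_j = \delta 2^{-(j+1)}$ and applying
the union bound, we have, with probability $\ge 1 - \delta$, for all $j \ge 0$ and all $A \in \mathcal{A}_j$,

$$\widehat{P}(A) \le 2\max\Big(P(A),\ 8\, \frac{\log(2^{j(d+1)} 16/\delta)}{n}\Big).$$

Dividing by $\mu(A) = 2^{-jd}$ and using the density bound $f_{\max}$, we get a bound
on $\max_{A \in \mathcal{A}_j} \widehat{f}(A)$, which implies that, with probability $\ge 1 - \delta$,

$$\Psi_j \le \sqrt{2^{jd}\, 8\, \frac{\log(2^{j(d+1)} 16/\delta)}{n} \cdot 2\max\Big(f_{\max},\ 8 \cdot 2^{jd}\, \frac{\log(2^{j(d+1)} 16/\delta)}{n}\Big)}.$$

And using $\delta = 1/n$ and $2^{j} = O((n/\log n)^{1/d})$, we get

$$\Psi_j \le c_4(f_{\max}, d) \sqrt{2^{jd}\, \frac{\log n}{n}}. \qquad □$$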



We now analyze the performance of the plug-in histogram-based level set 
estimator proposed in (1), and establish the following lemma that bounds 
its Hausdorff error. The first term denotes the estimation error while the 
second term, which is proportional to the sidelength of a cell ($2^{-j}$), reflects the
approximation error. We would like to point out that some arguments in
the proofs hold for $s_n$ large enough. This implies that some of the constants
in our proofs will depend on $\{s_i\}_{i=1}^{\infty}$, the exact form that the sequence $s_n$
takes (but not on $n$). However, we omit this dependence for simplicity.

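Lemma A.3. Consider densities satisfying assumptions [A1] and [B]. If
$j = j(n)$ is such that $2^{j} = O(s_n^{-1}(n/\log n)^{1/d})$, where $s_n$ is a monotone
diverging sequence, and $n \ge n_0 = n_0(f_{\max}, d, \delta_1, \epsilon_o, C_1, \alpha)$, then with probability
at least $1 - 3/n$,

$$d_\infty(\widehat{G}_j, G^*_\gamma) \le \max(2C_3 + 3,\ 8\sqrt{d}\,\epsilon_o^{-1}) \Big[ \Big(\frac{\Psi_j}{C_1}\Big)^{1/\alpha} + \sqrt{d}\, 2^{-j} \Big].$$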



Proof. Let $J_0 = \lceil \log_2 (4\sqrt{d}/\epsilon_o) \rceil$, where $\epsilon_o$ is as defined in assumption
[B]. Also, define

$$\epsilon_j := \Big(\frac{\Psi_j}{C_1}\Big)^{1/\alpha} + \sqrt{d}\, 2^{-j}.$$

Consider the following two cases:

I. $j < J_0$. For this case, since the domain $\mathcal{X} = [0, 1]^d$, we use the trivial
bound

$$d_\infty(\widehat{G}_j, G^*_\gamma) \le \sqrt{d} \le 2^{J_0}(\sqrt{d}\, 2^{-j}) \le 8\sqrt{d}\,\epsilon_o^{-1} \epsilon_j.$$

The last step follows by choice of $J_0$ and since $\Psi_j, C_1 > 0$.
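II. $j \ge J_0$. Observe that assumption [B] implies that $G^*_\gamma$ is not empty since
$G^*_\gamma \supseteq I_\epsilon(G^*_\gamma) \ne \emptyset$ for $\epsilon \le \epsilon_o$. We will show that for large enough $n$, with
high probability, $\widehat{G}_j \cap G^*_\gamma \ne \emptyset$ for $j \ge J_0$, and hence $\widehat{G}_j$ is not empty.
Thus, the Hausdorff error is given as

$$d_\infty(\widehat{G}_j, G^*_\gamma) = \max\Big\{ \sup_{x \in G^*_\gamma} \rho(x, \widehat{G}_j),\ \sup_{x \in \widehat{G}_j} \rho(x, G^*_\gamma) \Big\},$$

and we need bounds on the two terms in the right-hand side.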

To prove that $\hat{G}_j$ is not empty and obtain bounds on the two terms
in the Hausdorff error, we establish a proposition and corollary. In the
following analysis, if $G = \emptyset$, then we define $\sup_{x \in G} g(x) = 0$ for any function
$g(\cdot)$. The proposition establishes that for large enough $n$, with high
probability, all points whose distance to the boundary $\partial G^*_\gamma$ is greater
than $\epsilon_j$ are correctly excluded or included in the level set estimate.

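Proposition 2. If $j = j(n)$ is such that $2^j = O(s_n^{-1}(n/\log n)^{1/d})$, and
$n \ge n_1(f_{\max}, d, \delta_1)$, then with probability at least $1 - 2/n$,

\[
\sup_{x \in \hat{G}_j \Delta G^*_\gamma} \rho(x, \partial G^*_\gamma) \le \left(\frac{\Psi_j}{C_1}\right)^{1/\alpha} + \sqrt{d}\,2^{-j} = \epsilon_j.
\]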

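Proof. If $\hat{G}_j \Delta G^*_\gamma = \emptyset$, then $\sup_{x \in \hat{G}_j \Delta G^*_\gamma} \rho(x, \partial G^*_\gamma) = 0$ by definition,
and the result of the proposition holds. If $\hat{G}_j \Delta G^*_\gamma \neq \emptyset$, consider $x \in \hat{G}_j \Delta G^*_\gamma$.
Let $A_x \in \mathcal{A}_j$ denote the cell containing $x$ at resolution $j$. Consider the
following two cases:

(i) $A_x \cap \partial G^*_\gamma \neq \emptyset$. This implies that $\rho(x, \partial G^*_\gamma) \le \sqrt{d}\,2^{-j}$.

(ii) $A_x \cap \partial G^*_\gamma = \emptyset$. Since $x \in \hat{G}_j \Delta G^*_\gamma$, it is erroneously included or
excluded from the level set estimate $\hat{G}_j$. Therefore, if $\bar{f}(A_x) \ge \gamma$, then $\hat{f}(A_x) < \gamma$,
and if $\bar{f}(A_x) < \gamma$, then $\hat{f}(A_x) \ge \gamma$. This implies that $|\gamma - \bar{f}(A_x)| \le |\bar{f}(A_x) - \hat{f}(A_x)|$.
Using Lemma A.1, we get $|\gamma - \bar{f}(A_x)| \le \Psi_j$ with probability at least
$1 - \delta$.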

Now let $x_1$ be any point in $A_x$ such that $|\gamma - f(x_1)| \le |\gamma - \bar{f}(A_x)|$. (Notice
that at least one such point must exist in $A_x$ since this cell does not
intersect the boundary.) As argued above, $|\gamma - \bar{f}(A_x)| \le \Psi_j$ with probability
at least $1 - 1/n$ (for $\delta = 1/n$). Using Lemma A.2, for resolutions satisfying
$2^j = O(s_n^{-1}(n/\log n)^{1/d})$ and for large enough $n \ge n_1(f_{\max}, d, \delta_1)$, $\Psi_j \le \delta_1$;
hence, $|\gamma - f(x_1)| \le \delta_1$, with probability at least $1 - 1/n$. Thus, the density
regularity assumption [A1] holds at $x_1$ with probability $\ge 1 - 2/n$, and we
have

\[
\rho(x_1, \partial G^*_\gamma) \le \left(\frac{|\gamma - f(x_1)|}{C_1}\right)^{1/\alpha} \le \left(\frac{|\gamma - \bar{f}(A_x)|}{C_1}\right)^{1/\alpha} \le \left(\frac{\Psi_j}{C_1}\right)^{1/\alpha}.
\]

Since $x, x_1 \in A_x$,

\[
\rho(x, \partial G^*_\gamma) \le \rho(x_1, \partial G^*_\gamma) + \sqrt{d}\,2^{-j} \le \left(\frac{\Psi_j}{C_1}\right)^{1/\alpha} + \sqrt{d}\,2^{-j}.
\]

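So for both cases, if $j = j(n)$ is such that $2^j = O(s_n^{-1}(n/\log n)^{1/d})$ and
$n \ge n_1(f_{\max}, d, \delta_1)$, then with probability at least $1 - 2/n$, $\forall x \in \hat{G}_j \Delta G^*_\gamma$,
$\rho(x, \partial G^*_\gamma) \le (\Psi_j/C_1)^{1/\alpha} + \sqrt{d}\,2^{-j} = \epsilon_j$. □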

Based on Proposition 2, the following corollary argues that for large
enough $n$ and $j \ge J_0 = \lceil \log_2(4\sqrt{d}/\epsilon_o) \rceil$, with high probability, all points within
the inner cover $I_{2\epsilon_j}(G^*_\gamma)$ that are at a distance greater than $\epsilon_j$ are correctly
included in the level set estimate; hence, they lie in $\hat{G}_j \cap G^*_\gamma$. This also implies
that $\hat{G}_j$ is not empty.

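Corollary 1. Recall assumption [B] and denote the inner cover of
$G^*_\gamma$ with $2\epsilon_j$-balls, $I_{2\epsilon_j}(G^*_\gamma) \equiv I_{2\epsilon_j}$ for simplicity. If $j = j(n)$ is such that
$2^j = O(s_n^{-1}(n/\log n)^{1/d})$, $j \ge J_0$, and $n \ge n_0 = n_0(f_{\max}, d, \delta_1, \epsilon_o, C_1, \alpha)$, then
with probability at least $1 - 3/n$,

\[
\hat{G}_j \neq \emptyset \quad \text{and} \quad \sup_{x \in I_{2\epsilon_j}} \rho(x, \hat{G}_j \cap G^*_\gamma) \le \epsilon_j.
\]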



Proof. Observe that for $j \ge J_0$, $2\sqrt{d}\,2^{-j} \le 2\sqrt{d}\,2^{-J_0} \le \epsilon_o/2$. By Lemma A.2,
for resolutions satisfying $2^j = O(s_n^{-1}(n/\log n)^{1/d})$, and for large
enough $n \ge n_2(\epsilon_o, f_{\max}, C_1, \alpha)$, $2(\Psi_j/C_1)^{1/\alpha} \le \epsilon_o/2$, with probability at least
$1 - 1/n$. Therefore, for resolutions satisfying $2^j = O(s_n^{-1}(n/\log n)^{1/d})$ and
$j \ge J_0$, and for $n \ge n_2$, with probability at least $1 - 1/n$, $2\epsilon_j \le \epsilon_o$ and hence
$I_{2\epsilon_j} \neq \emptyset$.

Now consider any $2\epsilon_j$-ball in $I_{2\epsilon_j}$. Then the distance of all points in
the interior of the concentric $\epsilon_j$-ball from the boundary of $I_{2\epsilon_j}$, and hence
from the boundary of $G^*_\gamma$, is greater than $\epsilon_j$. As per Proposition 2, for
$n \ge n_0 = \max(n_1, n_2)$ with probability $\ge 1 - 3/n$, none of these points can
lie in $\hat{G}_j \Delta G^*_\gamma$; hence, they must lie in $\hat{G}_j \cap G^*_\gamma$ since they are in $I_{2\epsilon_j} \subseteq G^*_\gamma$.
Thus, $\hat{G}_j \neq \emptyset$, and for all $x \in I_{2\epsilon_j}$, $\rho(x, \hat{G}_j \cap G^*_\gamma) \le \epsilon_j$. □

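We now resume the proof of Lemma A.3, case II. Assume the conclusions
of Proposition 2 and Corollary 1 hold. Thus, all the following statements
hold for resolutions satisfying $2^j = O(s_n^{-1}(n/\log n)^{1/d})$, $j \ge J_0$ and $n \ge n_0 =
n_0(f_{\max}, d, \delta_1, \epsilon_o, C_1, \alpha)$, with probability at least $1 - 3/n$. Since $G^*_\gamma$ and $\hat{G}_j$
are nonempty sets, we now bound the two terms that contribute to the
Hausdorff error. To bound the term $\sup_{x \in \hat{G}_j} \rho(x, G^*_\gamma)$, observe that

\[
\begin{aligned}
\sup_{x \in \hat{G}_j} \rho(x, G^*_\gamma) &= \sup_{x \in \hat{G}_j \setminus G^*_\gamma} \rho(x, G^*_\gamma) = \sup_{x \in \hat{G}_j \setminus G^*_\gamma} \rho(x, \partial G^*_\gamma) \\
&\le \sup_{x \in \hat{G}_j \Delta G^*_\gamma} \rho(x, \partial G^*_\gamma) \le \epsilon_j,
\end{aligned}
\tag{7}
\]

where the last step follows from Proposition 2.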

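To bound the term $\sup_{x \in G^*_\gamma} \rho(x, \hat{G}_j)$, we recall assumption [B], which states
that the boundary points of $G^*_\gamma$ are $O(\epsilon_j)$ from the inner cover $I_{2\epsilon_j}(G^*_\gamma)$,
and we use Corollary 1 to bound the distance of the inner cover from $\hat{G}_j$ as
follows:

\[
\begin{aligned}
\sup_{x \in G^*_\gamma} \rho(x, \hat{G}_j) &\le \sup_{x \in G^*_\gamma} \rho(x, \hat{G}_j \cap G^*_\gamma) \\
&= \max\left\{\sup_{x \in I_{2\epsilon_j}} \rho(x, \hat{G}_j \cap G^*_\gamma),\ \sup_{x \in G^*_\gamma \setminus I_{2\epsilon_j}} \rho(x, \hat{G}_j \cap G^*_\gamma)\right\} \\
&\le \max\left\{\epsilon_j,\ \sup_{x \in G^*_\gamma \setminus I_{2\epsilon_j}} \rho(x, \hat{G}_j \cap G^*_\gamma)\right\},
\end{aligned}
\tag{8}
\]

where the last step follows from Corollary 1.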

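Now consider any $x \in G^*_\gamma \setminus I_{2\epsilon_j}$. By the triangle inequality, $\forall y \in \partial G^*_\gamma$ and
$\forall z \in I_{2\epsilon_j}$,

\[
\begin{aligned}
\rho(x, \hat{G}_j \cap G^*_\gamma) &\le \rho(x, y) + \rho(y, z) + \rho(z, \hat{G}_j \cap G^*_\gamma) \\
&\le \rho(x, y) + \rho(y, z) + \sup_{z' \in I_{2\epsilon_j}} \rho(z', \hat{G}_j \cap G^*_\gamma) \\
&\le \rho(x, y) + \rho(y, z) + \epsilon_j,
\end{aligned}
\]

where the last step follows from Corollary 1. This implies that, $\forall y \in \partial G^*_\gamma$,

\[
\begin{aligned}
\rho(x, \hat{G}_j \cap G^*_\gamma) &\le \rho(x, y) + \inf_{z \in I_{2\epsilon_j}} \rho(y, z) + \epsilon_j \\
&= \rho(x, y) + \rho(y, I_{2\epsilon_j}) + \epsilon_j \\
&\le \rho(x, y) + \sup_{y' \in \partial G^*_\gamma} \rho(y', I_{2\epsilon_j}) + \epsilon_j \\
&\le \rho(x, y) + 2C_3\epsilon_j + \epsilon_j,
\end{aligned}
\]

where the last step invokes assumption [B]. This, in turn, implies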

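\[
\rho(x, \hat{G}_j \cap G^*_\gamma) \le \inf_{y \in \partial G^*_\gamma} \rho(x, y) + (2C_3 + 1)\epsilon_j \le 2\epsilon_j + (2C_3 + 1)\epsilon_j.
\]

The second step is true for $x \in G^*_\gamma \setminus I_{2\epsilon_j}$, because if it were not true then
$\forall y \in \partial G^*_\gamma$, $\rho(x, y) > 2\epsilon_j$; hence, there would exist a closed $2\epsilon_j$-ball around $x$ that
is in $G^*_\gamma$. This contradicts the fact that $x \notin I_{2\epsilon_j}$. Therefore, we have

\[
\sup_{x \in G^*_\gamma \setminus I_{2\epsilon_j}} \rho(x, \hat{G}_j \cap G^*_\gamma) \le (2C_3 + 3)\epsilon_j.
\]

And going back to (8), we get

\[
\sup_{x \in G^*_\gamma} \rho(x, \hat{G}_j) \le (2C_3 + 3)\epsilon_j.
\tag{9}
\]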

From (7) and (9), we have that for all densities satisfying assumptions [A1]
and [B], if $j = j(n)$ is such that $2^j = O(s_n^{-1}(n/\log n)^{1/d})$, $j \ge J_0$ and $n \ge
n_0(f_{\max}, d, \delta_1, \epsilon_o, C_1, \alpha)$, then with probability $\ge 1 - 3/n$,

\[
d_\infty(\hat{G}_j, G^*_\gamma) = \max\left\{\sup_{x \in G^*_\gamma} \rho(x, \hat{G}_j),\ \sup_{x \in \hat{G}_j} \rho(x, G^*_\gamma)\right\} \le (2C_3 + 3)\epsilon_j.
\]

And addressing both case I ($j < J_0$) and case II ($j \ge J_0$), we finally have
that for all densities satisfying assumptions [A1] and [B], if $j = j(n)$ is such
that $2^j = O(s_n^{-1}(n/\log n)^{1/d})$, and $n \ge n_0 = n_0(f_{\max}, d, \delta_1, \epsilon_o, C_1, \alpha)$, then
with probability $\ge 1 - 3/n$,

\[
d_\infty(\hat{G}_j, G^*_\gamma) \le \max(2C_3 + 3,\ 8\sqrt{d}\,\epsilon_o^{-1})\,\epsilon_j.
\]

This completes the proof of Lemma A.3. □
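To make the error criterion in Lemma A.3 concrete, the following sketch evaluates the Hausdorff distance between two finite point sets as the larger of the two directed suprema that the proof bounds separately. Representing the sets as finite lists of points is an assumption made only for illustration (the sets in the paper are subsets of $[0,1]^d$), and the function names are hypothetical.

```python
import math


def directed_distance(A, B):
    """sup over a in A of the distance from a to the set B (finite sets)."""
    return max(min(math.dist(a, b) for b in B) for a in A)


def hausdorff_distance(A, B):
    """Hausdorff distance between two non-empty finite point sets."""
    if not A or not B:
        raise ValueError("both sets must be non-empty")
    # The two terms correspond to sup_{x in A} rho(x, B) and sup_{x in B} rho(x, A).
    return max(directed_distance(A, B), directed_distance(B, A))
```

For example, `hausdorff_distance([(0.0, 0.0), (1.0, 0.0)], [(0.0, 0.0)])` is driven entirely by the first directed term: the point `(1.0, 0.0)` is at distance 1 from the second set, while every point of the second set lies in the first.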

We now establish the result of Theorem 1. Since the local density regularity
parameter $\alpha$ is known, the appropriate histogram resolution can be
chosen as $2^{-j} \asymp s_n(n/\log n)^{-1/(d+2\alpha)}$. Let $\Omega$ denote the event such that the
bounds of Lemma A.2 (with $\delta = 1/n$) and Lemma A.3 hold. Then for $n \ge n_0$,
$P(\bar{\Omega}) \le 4/n$, where $\bar{\Omega}$ denotes the complement of $\Omega$. For $n < n_0$, we can use
the trivial inequality $P(\bar{\Omega}) \le 1$. So we have, for all $n$, $P(\bar{\Omega}) \le \max(4, n_0)\frac{1}{n} =:
C'\frac{1}{n}$. Here $C' = C'(f_{\max}, d, \delta_1, \epsilon_o, C_1, \alpha)$. So $\forall f \in \mathcal{F}^*_1(\alpha)$, we have the following.
(Explanation for each step is provided after the equations.)

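\[
\begin{aligned}
E[d_\infty(\hat{G}_j, G^*_\gamma)] &= P(\Omega)E[d_\infty(\hat{G}_j, G^*_\gamma) \mid \Omega] + P(\bar{\Omega})E[d_\infty(\hat{G}_j, G^*_\gamma) \mid \bar{\Omega}] \\
&\le E[d_\infty(\hat{G}_j, G^*_\gamma) \mid \Omega] + P(\bar{\Omega})\sqrt{d} \\
&\le \max(2C_3 + 3,\ 8\sqrt{d}\,\epsilon_o^{-1})\left[\left(\frac{\Psi_j}{C_1}\right)^{1/\alpha} + \sqrt{d}\,2^{-j}\right] + C'\frac{\sqrt{d}}{n} \\
&\le C\max\left\{\left(2^{jd}\,\frac{\log n}{n}\right)^{1/(2\alpha)},\ 2^{-j},\ \frac{1}{n}\right\} \\
&\le C\max\left\{s_n^{-d/(2\alpha)}\left(\frac{n}{\log n}\right)^{-1/(d+2\alpha)},\ s_n\left(\frac{n}{\log n}\right)^{-1/(d+2\alpha)},\ \frac{1}{n}\right\} \\
&\le C s_n\left(\frac{n}{\log n}\right)^{-1/(d+2\alpha)}.
\end{aligned}
\]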

Here $C = C(C_1, C_3, \epsilon_o, f_{\max}, \delta_1, d, \alpha)$. The second step follows by observing
the trivial bounds $P(\Omega) \le 1$ and $E[d_\infty(\hat{G}_j, G^*_\gamma) \mid \bar{\Omega}] \le \sqrt{d}$ since the domain
$\mathcal{X} = [0,1]^d$. The third step follows from Lemma A.3 and the fourth one
using Lemma A.2. The fifth step follows since the chosen resolution $2^{-j} \asymp
s_n(n/\log n)^{-1/(d+2\alpha)}$.
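The fifth step can be checked numerically. The sketch below plugs the resolution $2^{-j} = s_n(n/\log n)^{-1/(d+2\alpha)}$ into the estimation term $(2^{jd}\log n/n)^{1/(2\alpha)}$ and compares it with the closed form $s_n^{-d/(2\alpha)}(n/\log n)^{-1/(d+2\alpha)}$. All constants are set to 1 and the function name `error_terms` is hypothetical; it is an illustration of the algebra under these assumptions, not part of the proof.

```python
import math


def error_terms(n, d, alpha, s_n):
    """Return (sidelength, estimation term, closed-form estimation term).

    Assumption: constants are dropped, i.e. 2^-j is taken to be exactly
    s_n * (n / log n)^(-1 / (d + 2 alpha)).
    """
    r = n / math.log(n)
    side = s_n * r ** (-1.0 / (d + 2 * alpha))          # approximation term 2^-j
    est = (side ** (-d) / r) ** (1.0 / (2 * alpha))     # (2^{jd} log n / n)^{1/(2 alpha)}
    closed_form = s_n ** (-d / (2 * alpha)) * r ** (-1.0 / (d + 2 * alpha))
    return side, est, closed_form
```

For any $s_n \ge 1$ the estimation term is no larger than the sidelength, which is why the final bound is governed by $s_n(n/\log n)^{-1/(d+2\alpha)}$.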

APPENDIX B: PROOF OF THEOREM 2 

To analyze the resolution chosen by the complexity penalized procedure of 
(3) based on the vernier, we first establish two results regarding the vernier. 
Using Lemma A.1, we have the following corollary that bounds the deviation
of the true and empirical verniers.



Corollary B.1. Consider $0 < \delta < 1$. With probability at least $1 - \delta$,
the following is true for all $j \ge 0$:

\[
|\mathcal{V}_{\gamma,j} - \hat{\mathcal{V}}_{\gamma,j}| \le \Psi_{j'}.
\]

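Proof. Let $A_0 \in \mathcal{A}_j$ denote the cell achieving the minimum defining
$\mathcal{V}_{\gamma,j}$ and $A_1 \in \mathcal{A}_j$ denote the cell achieving the minimum defining $\hat{\mathcal{V}}_{\gamma,j}$. Also,
let $A'_{00}$ and $A'_{10}$ denote the subcells at resolution $j'$ within $A_0$ and $A_1$,
respectively, that have maximum average density deviation from $\gamma$. Similarly,
let $A'_{01}$ and $A'_{11}$ denote the subcells at resolution $j'$ within $A_0$ and $A_1$,
respectively, that have maximum empirical density deviation from $\gamma$. Then,
we have

\[
\begin{aligned}
\mathcal{V}_{\gamma,j} - \hat{\mathcal{V}}_{\gamma,j} &= |\gamma - \bar{f}(A'_{00})| - |\gamma - \hat{f}(A'_{11})| \\
&\le |\gamma - \bar{f}(A'_{10})| - |\gamma - \hat{f}(A'_{11})| \le |\bar{f}(A'_{10}) - \hat{f}(A'_{11})| \\
&\le \max\{\bar{f}(A'_{10}) - \hat{f}(A'_{10}),\ \hat{f}(A'_{11}) - \bar{f}(A'_{11})\} \\
&\le \max_{A \in \mathcal{A}_{j'}} |\bar{f}(A) - \hat{f}(A)| \le \Psi_{j'}.
\end{aligned}
\]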
The first inequality invokes the definition of $A_0$, the third inequality invokes
the definitions of the subcells $A'_{10}$, $A'_{11}$ and the last one follows from Lemma
A.1. The bound on $\hat{\mathcal{V}}_{\gamma,j} - \mathcal{V}_{\gamma,j}$ follows similarly. □
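The min-max structure used in this proof (a minimum over cells at resolution $j$ of the maximum deviation from $\gamma$ over their subcells at resolution $j'$) can be sketched in one dimension as follows. The input is assumed to be a list of empirical subcell densities at resolution $j'$ on $[0,1]$; the function name `empirical_vernier` is hypothetical.

```python
def empirical_vernier(f_hat_fine, gamma, j, j_prime):
    """Min over cells at resolution j of the max |gamma - f_hat(A')| over subcells A'.

    Assumption: d = 1 and f_hat_fine holds the 2**j_prime subcell densities in order.
    """
    if j_prime < j or len(f_hat_fine) != 2 ** j_prime:
        raise ValueError("need j_prime >= j and 2**j_prime subcell densities")
    per_cell = 2 ** (j_prime - j)          # subcells inside each cell at resolution j
    worst_per_cell = []
    for k in range(2 ** j):
        sub = f_hat_fine[k * per_cell:(k + 1) * per_cell]
        worst_per_cell.append(max(abs(gamma - f) for f in sub))
    return min(worst_per_cell)
```

With subcell densities `[0.5, 1.5, 1.0, 1.1]`, `gamma=1.0`, `j=1` and `j_prime=2`, the first cell has maximum deviation 0.5 and the second about 0.1, so the sketch returns roughly 0.1.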

The second result establishes that the vernier is sensitive to the resolution
and density regularity.

Lemma B.1. Consider densities satisfying assumptions [A] and [B].
Recall that $j' = \lfloor j + \log_2 s_n \rfloor$, where $s_n$ is a monotone diverging sequence.
There exists $C = C(C_2, f_{\max}, \delta_2, \alpha) > 0$ such that if $n$ is large enough so that
$s_n > 8\max(3\epsilon_o^{-1}, 28, 12C_3)\sqrt{d}$, then for all $j \ge 0$,

\[
\min(\delta_1, C_1)\,2^{-j'\alpha} \le \mathcal{V}_{\gamma,j} \le C(\sqrt{d}\,2^{-j})^{\alpha}.
\]

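Proof. We first establish the upper bound. Recall assumption [A] and
consider the cell $A_0 \in \mathcal{A}_j$ that contains the point $x_0$. Then, $A_0 \cap \partial G^*_\gamma \neq \emptyset$.
Let $A'_0$ denote the subcell at resolution $j'$ within $A_0$ that has maximum
average density deviation from $\gamma$. Consider the following two cases:

(i) If the resolution is high enough so that $\sqrt{d}\,2^{-j} \le \delta_2$, then the density
regularity assumption [A2] holds $\forall x \in A_0$ since $A_0 \subset B(x_0, \delta_2)$, the $\delta_2$-ball
around $x_0$. The same holds also for the subcell $A'_0$. Hence,

\[
|\gamma - \bar{f}(A'_0)| \le C_2(\sqrt{d}\,2^{-j})^{\alpha}.
\]

(ii) If the resolution is not high enough and $\sqrt{d}\,2^{-j} > \delta_2$, use the following
trivial bound: $|\gamma - \bar{f}(A'_0)| \le f_{\max} \le \frac{f_{\max}}{\delta_2^{\alpha}}(\sqrt{d}\,2^{-j})^{\alpha}$.

Hence, we can say for all $j$ there exists a cell $A_0 \in \mathcal{A}_j$ such that

\[
\max_{A' \in A_0 \cap \mathcal{A}_{j'}} |\gamma - \bar{f}(A')| = |\gamma - \bar{f}(A'_0)| \le \max\left(C_2,\ \frac{f_{\max}}{\delta_2^{\alpha}}\right)(\sqrt{d}\,2^{-j})^{\alpha}.
\]

This yields the upper bound on the vernier, $\mathcal{V}_{\gamma,j} \le C(\sqrt{d}\,2^{-j})^{\alpha}$, where $C =
C(C_2, f_{\max}, \delta_2, \alpha)$.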

For the lower bound, consider any cell $A \in \mathcal{A}_j$. We will show that the level
set regularity assumption [B] implies that for large enough $n$ (so that the
sidelength $2^{-j'}$ is small enough), the boundary does not intersect all subcells
at resolution $j'$ within the cell $A$ at resolution $j$. In fact, there exists at least
one subcell $A'_1 \in A \cap \mathcal{A}_{j'}$ such that $\forall x \in A'_1$,

\[
\rho(x, \partial G^*_\gamma) \ge 2^{-j'}.
\]

We establish this statement formally later on, but for now assume that it
holds. The local density regularity condition [A] now gives that for all $x \in A'_1$,
$|\gamma - f(x)| \ge \min(\delta_1, C_1 2^{-j'\alpha}) \ge \min(\delta_1, C_1)2^{-j'\alpha}$. So we have

\[
\max_{A' \in A \cap \mathcal{A}_{j'}} |\gamma - \bar{f}(A')| \ge |\gamma - \bar{f}(A'_1)| \ge \min(\delta_1, C_1)\,2^{-j'\alpha}.
\]
Since this is true for any $A \in \mathcal{A}_j$, in particular, this is true for the cell
achieving the minimum defining $\mathcal{V}_{\gamma,j}$. Hence, the lower bound on the vernier
$\mathcal{V}_{\gamma,j}$ follows.

We now formally prove that the level set regularity assumption [B] implies
that for large enough $n$ (so that $s_n > 8\max(3\epsilon_o^{-1}, 28, 12C_3)\sqrt{d}$), $\exists A'_1 \in A \cap
\mathcal{A}_{j'}$ such that $\forall x \in A'_1$,

\[
\rho(x, \partial G^*_\gamma) \ge 2^{-j'}.
\]

Observe that if we consider any cell at resolution $j'' := j' - 2$ that does not
intersect the boundary $\partial G^*_\gamma$, then it contains a cell at resolution $j'$ that is
greater than $2^{-j'}$ away from the boundary. Thus, it suffices to show that for
large enough $n$ [so that $s_n > 8\max(3\epsilon_o^{-1}, 28, 12C_3)\sqrt{d}$], $\exists A'' \in A \cap \mathcal{A}_{j''}$ such
that $A'' \cap \partial G^*_\gamma = \emptyset$. We prove the last statement by contradiction. Suppose
that for $s_n > 8\max(3\epsilon_o^{-1}, 28, 12C_3)\sqrt{d}$, all subcells in $A$ at resolution $j''$
intersect the boundary $\partial G^*_\gamma$. Let $\epsilon = 3\sqrt{d}\,2^{-j''}$. Then,

\[
\epsilon = 3\sqrt{d}\,2^{-j''} = 12\sqrt{d}\,2^{-j'} \le \frac{24\sqrt{d}}{s_n}\,2^{-j} \le \frac{24\sqrt{d}}{s_n} \le \epsilon_o,
\]

where the last step follows since $s_n \ge 24\sqrt{d}\,\epsilon_o^{-1}$. By choice of $\epsilon$, every closed
$\epsilon$-ball in $A$ must contain an entire subcell at resolution $j''$ and in fact must
contain an open neighborhood around that subcell. Since the boundary intersects
all subcells at resolution $j''$, this implies that every closed $\epsilon$-ball in
$A$ contains a boundary point and in fact contains an open neighborhood
around that boundary point. Thus, (i) every closed $\epsilon$-ball in $A$ contains
points not in $G^*_\gamma$, and hence cannot lie in $I_\epsilon(G^*_\gamma)$. Also, observe that since
all subcells in $A$ at resolution $j''$ intersect the boundary of $G^*_\gamma$, (ii) there
exists a boundary point $x_1$ that is within $\sqrt{d}\,2^{-j''}$ of the center of cell $A$.
From (i) and (ii) it follows that

\[
\rho(x_1, I_\epsilon(G^*_\gamma)) \ge \frac{2^{-j}}{2} - \sqrt{d}\,2^{-j''} - 2\epsilon = \frac{2^{-j}}{2} - 28\sqrt{d}\,2^{-j'}
\ge 2^{-j}\left(\frac{1}{2} - \frac{56\sqrt{d}}{s_n}\right) > \frac{2^{-j}}{4},
\]

where the last step follows since $s_n > 224\sqrt{d}$. However, assumption [B] implies
that for $\epsilon \le \epsilon_o$,

\[
\rho(x_1, I_\epsilon(G^*_\gamma)) \le C_3\epsilon = 3C_3\sqrt{d}\,2^{-j''} = 12C_3\sqrt{d}\,2^{-j'} \le \frac{24C_3\sqrt{d}}{s_n}\,2^{-j} < \frac{2^{-j}}{4},
\]

where the last step follows since $s_n > 96C_3\sqrt{d}$, and we have a contradiction.
This completes the proof of Lemma B.1. □
We are now ready to prove Theorem 2. To analyze the resolution $\hat{j}$ chosen
by (3), we first derive upper bounds on $\mathcal{V}_{\gamma,\hat{j}}$ and $\Psi_{\hat{j}'}$ that effectively characterize
the approximation error and estimation error, respectively. Thus, a
bound on the vernier $\mathcal{V}_{\gamma,\hat{j}}$ will imply that the chosen resolution $\hat{j}$ cannot be
too coarse, and a bound on the penalty will imply that the chosen resolution
is not too fine. Using Corollary B.1 and (3), we have the following oracle
inequality that holds with probability at least $1 - \delta$:

\[
\mathcal{V}_{\gamma,\hat{j}} \le \hat{\mathcal{V}}_{\gamma,\hat{j}} + \Psi_{\hat{j}'} = \min_{0 \le j \le J}\{\hat{\mathcal{V}}_{\gamma,j} + \Psi_{j'}\} \le \min_{0 \le j \le J}\{\mathcal{V}_{\gamma,j} + 2\Psi_{j'}\}.
\]

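Lemma B.1 provides an upper bound on the vernier $\mathcal{V}_{\gamma,j}$, and Lemma A.2
provides an upper bound on the penalty $\Psi_{j'}$. We plug these bounds into the
oracle inequality. Here $C$ may denote a different constant from line to line.
With probability at least $1 - 2/n$ (with $\delta = 1/n$),

\[
\begin{aligned}
\mathcal{V}_{\gamma,\hat{j}} \le \hat{\mathcal{V}}_{\gamma,\hat{j}} + \Psi_{\hat{j}'} &\le C\min_{0 \le j \le J}\left\{\max\left(2^{-j\alpha},\ \sqrt{2^{jd} s_n^d\,\frac{\log n}{n}}\right)\right\} \\
&\le C s_n^{d\alpha/(d+2\alpha)}\left(\frac{n}{\log n}\right)^{-\alpha/(d+2\alpha)}.
\end{aligned}
\]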



Here $C = C(C_2, f_{\max}, \delta_2, d, \alpha)$. The bound on the minimum uses the definition of $j'$, and
the final step follows by balancing the two terms for the optimal resolution
$j^*$ given by $2^{-j^*} \asymp s_n^{d/(d+2\alpha)}(n/\log n)^{-1/(d+2\alpha)}$. This establishes the desired
bounds on $\mathcal{V}_{\gamma,\hat{j}}$ and $\Psi_{\hat{j}'}$.

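Now, using Lemma B.1 and the definition of $j'$, we have the following
upper bound on the chosen sidelength. For $s_n > 8\max(3\epsilon_o^{-1}, 28, 12C_3)\sqrt{d}$,

\[
2^{-\hat{j}} \le s_n 2^{-\hat{j}'} \le s_n\left(\frac{\mathcal{V}_{\gamma,\hat{j}}}{\min(\delta_1, C_1)}\right)^{1/\alpha} \le c_2\, s_n s_n^{d/(d+2\alpha)}\left(\frac{n}{\log n}\right)^{-1/(d+2\alpha)},
\]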

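where $c_2 = c_2(C_1, C_2, f_{\max}, \delta_1, \delta_2, d, \alpha) > 0$. Also notice that since $2^J \asymp s_n^{-1}
(n/\log n)^{1/d}$, we have $2^{\hat{j}'} \le s_n 2^{\hat{j}} \le s_n 2^J \asymp (n/\log n)^{1/d}$, and thus $\hat{j}'$ satisfies
the condition of Lemma A.2. Therefore, using Lemma A.2, we get a lower
bound on the sidelength. With probability at least $1 - 2/n$,

\[
2^{-\hat{j}} > \frac{s_n}{2}\,2^{-\hat{j}'} \ge \frac{s_n}{2}\left(\frac{\Psi_{\hat{j}'}^2}{c_3^2}\,\frac{n}{\log n}\right)^{-1/d} \ge c_1\, s_n s_n^{-2\alpha/(d+2\alpha)}\left(\frac{n}{\log n}\right)^{-1/(d+2\alpha)} = c_1\, s_n^{d/(d+2\alpha)}\left(\frac{n}{\log n}\right)^{-1/(d+2\alpha)},
\]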



where $c_1 = c_1(C_2, f_{\max}, \delta_2, d, \alpha) > 0$. So we have for $s_n > 8\max(3\epsilon_o^{-1}, 28,
12C_3)\sqrt{d}$, with probability at least $1 - 2/n$,

\[
c_1\, s_n^{d/(d+2\alpha)}\left(\frac{n}{\log n}\right)^{-1/(d+2\alpha)} \le 2^{-\hat{j}} \le c_2\, s_n s_n^{d/(d+2\alpha)}\left(\frac{n}{\log n}\right)^{-1/(d+2\alpha)},
\tag{10}
\]

where $c_1 = c_1(C_2, f_{\max}, \delta_2, d, \alpha) > 0$ and $c_2 = c_2(C_1, C_2, f_{\max}, \delta_1, \delta_2, d, \alpha) > 0$.
Hence, the automatically chosen resolution behaves as desired.
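The selection rule analyzed above can be sketched as a simple minimization. This assumes that the procedure in (3) picks the resolution minimizing the empirical vernier plus the penalty evaluated at the finer resolution $j'$; the function name `select_resolution` and the list-based inputs are hypothetical.

```python
def select_resolution(vernier_hat, penalty):
    """Return the index j in 0..J minimizing vernier_hat[j] + penalty[j].

    Assumption: penalty[j] stands for the penalty Psi evaluated at resolution j'.
    """
    if not vernier_hat or len(vernier_hat) != len(penalty):
        raise ValueError("need two non-empty lists of equal length")
    scores = [v + p for v, p in zip(vernier_hat, penalty)]
    # The vernier typically shrinks with j while the penalty grows,
    # so the minimizer balances the two terms.
    return min(range(len(scores)), key=scores.__getitem__)
```

With `vernier_hat = [0.5, 0.2, 0.05]` and `penalty = [0.01, 0.1, 0.4]`, the sums are smallest at the middle resolution, which is neither too coarse (large vernier) nor too fine (large penalty).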

Now we can invoke Lemma A.3 to derive the rate of convergence for
the Hausdorff error. Consider large enough $n \ge n_1(C_3, \epsilon_o, d)$ so that $s_n >
8\max(3\epsilon_o^{-1}, 28, 12C_3)\sqrt{d}$. Also, recall that the condition of Lemma A.3
requires that $n \ge n_0(f_{\max}, d, \delta_1, \epsilon_o, C_1, \alpha)$. Pick $n \ge \max(n_0, n_1)$ and let
$\Omega$ denote the event such that the bound of Lemma A.3 and the upper
and lower bounds on the chosen resolution in (10) hold. Then, we have
$P(\bar{\Omega}) \le 5/n$. For $n < \max(n_0, n_1)$, we can use the trivial inequality $P(\bar{\Omega}) \le
1$. So we have, for all $n$, $P(\bar{\Omega}) \le \max(5, \max(n_0, n_1))\frac{1}{n} =: C\frac{1}{n}$. Here $C =
C(C_1, C_3, \epsilon_o, f_{\max}, \delta_1, d, \alpha)$. So $\forall f \in \mathcal{F}^*_2(\alpha)$, we have the following. (Here $C$
may denote a different constant from line to line. Explanation for each step
is provided after the equations.)



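\[
\begin{aligned}
E[d_\infty(\hat{G}, G^*_\gamma)] &= P(\Omega)E[d_\infty(\hat{G}, G^*_\gamma) \mid \Omega] + P(\bar{\Omega})E[d_\infty(\hat{G}, G^*_\gamma) \mid \bar{\Omega}] \\
&\le E[d_\infty(\hat{G}, G^*_\gamma) \mid \Omega] + P(\bar{\Omega})\sqrt{d} \\
&\le C\left[\left(\frac{\Psi_{\hat{j}}}{C_1}\right)^{1/\alpha} + \sqrt{d}\,2^{-\hat{j}} + \frac{1}{n}\right] \\
&\le C\max\left\{\left(2^{\hat{j}d}\,\frac{\log n}{n}\right)^{1/(2\alpha)},\ 2^{-\hat{j}},\ \frac{1}{n}\right\} \\
&\le C\max\left\{s_n^{-d^2/(2\alpha(d+2\alpha))}\left(\frac{n}{\log n}\right)^{-1/(d+2\alpha)},\ s_n s_n^{d/(d+2\alpha)}\left(\frac{n}{\log n}\right)^{-1/(d+2\alpha)},\ \frac{1}{n}\right\} \\
&\le C s_n^2\left(\frac{n}{\log n}\right)^{-1/(d+2\alpha)}.
\end{aligned}
\]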



Here $C = C(C_1, C_2, C_3, \epsilon_o, f_{\max}, \delta_1, \delta_2, d, \alpha)$. The second step follows by observing
the trivial bounds $P(\Omega) \le 1$ and, since the domain $\mathcal{X} = [0,1]^d$, $E[d_\infty(\hat{G},
G^*_\gamma) \mid \bar{\Omega}] \le \sqrt{d}$. The third step follows from Lemma A.3 and the fourth one
from Lemma A.2. The fifth step follows using the upper and lower bounds
established on $2^{-\hat{j}}$ in (10).